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Abstract. - We introduce a stochastic model to explain a double power-law distribution which 
exhibits two different Paretian behaviors in the upper and the lower tail and widely exists in social 
and economic systems. The model incorporates fitness consideration and noise fluctuation. We 
find that if the number of variables (e.g. the degree of nodes in complex networks or people's 
incomes) grows exponentially, normal distributed fitness coupled with exponentially increasing 
variable is responsible for the emergence of the double power-law distribution. Fluctuations do not 
change the result qualitatively but contribute to the second-part scaling exponent. The evolution 
of Chinese airline network is taken as an example to show a nice agreement with our stochastic 
model. 
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Qh| Introduction. — Power law behaviors are now perva- 
sive in various kinds of studies pTfM], which give an im- 
^Hportant class of complex networks, namely the scale-free 
^networks. However, in some cases, such a single property 
J^ is insufficient to describe the distributions in real- world 
systems in which scaling law is absent in some regions 
^vqor, even more peculiar, changed at some critical points 
|l5l[T6]. In contrast to the typical power law, distribution 
including two different power-law regions is called dou- 
ble power-law whose cumulative distribution, namely the 
probability that variable K is larger than a specific value 
^Z::, is given by [16]: 

'^1 nK > k) = iizi^lf- (1) 

_ ^ where 71 and 72 are two scaling exponents while kc is the 
turning point. This kind of property exists widely in social 
and economic systems such as the degree distribution of 
airline network, word network or scientist collaboration 
network and the distribution of people's incomes p!6H2Q] . 

Some works related to double power law concentrate on 
how to fit such distribution by a uniform function rather 
than treat two power separately. Non-extensive statistical 
theory and the combination of different power-law func- 
tions were applied to the problem [22j[23]. However these 
works cannot tell us how this nontrivial property comes to 
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its being. To understand its underlying mechanism. Reed 
proposed a model based on geometric Brownian motion 
[W. He proved that such process coupled with exponen- 
tial distributed evolution time causes, as he called, a dou- 
ble Pareto-lognormal distribution which has a lognormal 
body but power-law behaviour in both tails. This distri- 
bution is shown to provide an excellent fit to observed data 
on incomes and earnings. Dorogovtsev and Mendes pro- 
posed another different model aiming to explain the double 
power-law degree distribution in word network [24]. The 
model is constructed by two mechanisms: the preferen- 
tial attachment and the creation of new links between old 
nodes which increases with evolution time. By continues 
approach, they show the degree distribution has two dif- 
ferent scaling exponents, —1.5 for upper tail and —3 for 
lower tail. 

Although the above two models can explain incomes 
and word network respectively, both of them have lim- 
itations. In word network model the scaling exponents 
are fixed. Thus it cannot explain the distributions with 
distinct scaling exponents. While in Reed's model, the rel- 
ative increase rate of incomes is assumed to be the same 
for all people. This is far from our knowledge that persons 
have heterogeneous ability of making money. Therefore 
fitness character must be taken into account to general- 
ize the model. Besides, although the previous models re- 
produced some characters, the evolution of the real-world 
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systems have not been investigated to support their model 
assumptions. 

In this Letter, a general stochastic model is developed 
to explain the double power-law distribution. The model 
incorporates fitness consideration and noise fluctuation, 
which is general to describe many real-world system evo- 
lution. We find normal distributed fitness coupled with 
exponentially increasing variables is responsible for the 
emergence of the double power-law distribution while fluc- 
tuation does not change the result qualitatively but con- 
tribute to the scaling exponent. We also investigate the 
evolution of CAN to provide evidence for the proposed 
model. 

Generalized model for double power law. — Let's 
denote ki(t^t') the value at time t of the i-ih. variable 
which comes into the system at time t' and denote N{t) 
the number of the total variables in the system at time 
t. Regardless of the specific meaning of ki{t^t')^ its evo- 
lution pattern usually shares common features. For an 
example, the increasing rate of hi is proportional to hi 
itself. This is probably caused by the preferential attach- 
ment (also called the rich get richer) that widely exists in 
self-organized complex systems 0]. Besides, it is natural 
to assume that the increasing rate is proportional to some 
of its own attributes which is called fitness, denoted as rn 
[25] . In reality fitness can be interpreted as, for exam- 
ples, capital, social skills, activity levels of individuals and 
population or Gross Domestic Product (GDP) of cities. 
The increasing rate can also be influenced by other ingre- 
dients which can be normalized to be a time-dependent 
factors. But as the first step, let us focus on the simplest 
case where such factors are treated as constants. In this 
context, the evolution equation of ki{t^t') is given by: 



dki 
~dt 



(2) 



Thus ki grows exponentially as ki{t^t') = e^^^^ 



^'^ (assum- 
ing ki{t\t') = 1). This directly restricts the form of N{t) 
in some systems such as node degree evolution in a net- 
work. Limited by its structure, the exponentially growing 
degree ki requires the exponentially growing N{t) (or even 
faster). Although in some cases such as people's incomes 
N{t) does not encounter this problem, it has also been 
assumed to increase exponentially jl9j. Therefore we as- 
sume 

N{t) (xe"-^ (3) 

The distribution of fitness rji is critical in our model. As 
we will analyze, normal distributed rji is essential to pro- 
duce the double power-law distribution. Note that when 
we choose the specific parameters for r]i in a network- 
structured system, we meet with the similar problem to 
N{t). Since degree of a node can not exceed n(t) — 1, to 
keep a long time evolution rji should be restricted so that 
the mean value of r]i , denoted as jjjjj , cannot be far larger 
than Cn while the standard variance, denoted as cr^, should 
be bounded properly. 



Now let us turn to analyze the distribution un- 
der the above-mentioned condition. Using equation 
p{ki{t^t'))dki{t^t') = f{r]i)dr]i, the distribution of ki{t^t') 
is easily derived to follow a lognormal distribution. Since 
the variables are added exponentially, their lifetime, de- 
fined as T = t — t'^ follows exponential distribution. There- 
fore the value of ki we actually examined is lognormal 
kiir) mixed by exponentially distributed T. Thus the 
distribution reads: 



p{k) 



1 
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where tc is the time when we examine p{k). If 0, 
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) and Eq.(4) gives: 



(5) 



where ({k) = {q' ^^q. The variables ki follow a power-law 
distribution with a cut-off at kc = e^^^^ which is the max- 
imum value. On the other hand 
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if (T^ ^ oo and the integral becomes k independent, indi- 
cating a uniform distribution. However this never happens 
in a network-structured system since is limited as we 
have discussed. 

For a finite > 0, it is difficult to derive analytical 
result from Eq.(4). Therefore numerical experiments are 
applied to analyze the problem. The simulation is car- 
ried out by the following instructions. At each time step 
new variables increasing exponentially are added by initial 
value 1 and are assigned fitness chosen from a normal dis- 
tribution. Then each variable increases its value according 
to Eq.(2). We simulate the cumulative distribution for dif- 
ferent cr^ as shown in Fig. [H When = 0, the distribu- 
tion follows a power-law form with a cut-off at e^^^^, as we 
have discussed above. With the increase of cr^y, the second 
part of the distributions decreases more and more slowly 
while their shapes seem to be a power law. It is noteworthy 
that the turning point occurs at about kc = e^^^^ which 
is exact the point at which the cut-off occurs for = 0. 
Therefore the turning point is expected to increase with 
the evolution time tc as kc ~ e^^^^, as well demonstrated 
in Fig. H 

Another interesting discovery is that with the increase 
of <j^, the first part of the distributions does not change 
significantly. From Fig. (TJ we see that all the first part of 
the curves superpose the distribution of cr^y = well. Thus 
the first part ofp{k) must follow the same power-law distri- 
bution as the p(k) of cr„ = with exponent 71 = — which 
is independent of cr^. However p{k) will finally become 
uniform distribution when ^ 00. There must have a 
transition point atr at which the first part of the distri- 
bution starts to change. By extensive numerical experi- 
ments, the transition point is determined to be atr ^ 
The distributions of > atr are not studied since we only 
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Fig. 1: Simulation of the cumulative distribution of variable ki. 
The simulation is carried out with fir^ = 0.15, N{t) = SOe^ o^^* 
and tc = 30. For other fir] and exponential increasing N(t), similar 
results can be obtained. The study is averaged over 50 realizations. 



Fig. 2: Simulation of the cumulative distribution of ki at different 
tc. The simulation is carried out with fij] = 0.15, N{t) = 50e°-°^^* 
and = 0.02. Inset: the correlation of kc and tc- The fitted 
line (solid line) represents y = 0.6e^-^^^ , which illustrates that the 
turning point kc increases with tc as kc ~ e^^*^. 



concern small a^. When = atr, the second part of the 



distribution, as seen in Fig. (TJ still decreases faster than 
k . This result indicates that for all the < a^r, the 
second part of the distributions has an upper bound. 

Now let us prove that the second part of p{k) has a 
lower bound. It is easy to examine that when \n{k) > 



ln(t) 
t-1 



- IJinYi the following inequality is valid: 
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Since cr^^ is usually very small, the above inequality is 
approximately considered to be valid when ln(/c) > jir^T. 
Therefore for any k > e^^^^^ (namely the second part of 
p{k). The following discussion is restricted to this con- 
dition), the integral of the left term in Eq.(6) from to 
tc must be larger than that of the right term. The inte- 
gral of the left term is exactly the degree distribution p{k) 
while the integral of the right term, according to Ref. [10] , 
follows asymptotically a power-law function with the ex- 
ponents related to /i^, and Cn- Thus for a specific group 
of above exponents, p{k) has a lower bound. 

If p{k) is not oscillatory, the existence of the upper and 
the lower bound allows ^"fffe^ to have a hmitation. As- 
suming it to be —7, then p{k) is written asp(/c) oc l{k)k~^ ^ 
where l{k) ~ o{k^) and k~^ ~ o{l{k)) are valid for any 
/3 > when k ^ 00. Therefore for large /c, given any 
constant u > 1, we have the following inequality: 



{uk)-^ ^ l(uk) 



k-P 



m 
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Let /3 ^ 0, then we have 



lim 



l(uk) 
W 



1. 



(8) 



Note that Eq.(8) is also valid for any constant {) < u <1. 
This property of l(k) follows directly from the requirement 
that p{k) is asymptotically scale invariant. Thus, the form 
of l(k) only controls the finite extent of the lower tail and 
will not affect its scaling exponent significantly. So the 
second part of the p{k) is also power law. For further 
study, numerical simulation is applied to determine the 
exponent of the second part of denoted as 72. It is 
found that 

72 — . (9) 

The result is well demonstrated by the simulation as shown 
in Fig. [3l It is noteworthy that ^ leads to 72 ^ 00. 
The second part of p{k) degenerates naturally to be a cut- 
off. The exact formulation of 72 may also be related to 
and iir] , but extensive simulations indicate that 72 is much 
less sensitive to Cn(or /i^) than to parameter a^. 

So far we have provided a possible model to produce 
double power-law distribution. However real complex sys- 
tems such as the Internet or WWW usually include fiuc- 
tuations that may be essential to describe the dynamics 
of its evolution [T0l[2T]. Therefore, a general model must 
be able to describe this feature. The generalization can be 
carried out by modifying Eq.(2): 



dki = priikidt + (1 — p){fir]dt + adw)ki 



(10) 



where dw is white noise of standard normal distribution 
^"^^ and a is the standard variance of fiuctuations. p G [0, 1] is 
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Fig. 3: Numerical studies of 72. The evolution time tc is selected 
properly to keep the number of variables at about 1000 while the 
initial number of variables equals to 50. The fitted lines (solid line 
for i^rj = 0.15, Cn = 0.088 and dashed line for /j^n = 0.18, Cn = 0.1) 
is given by 72 = 0.46+ and 72 = 0.44+ respectively. The 

study is averaged over 50 realizations. For other /iry and Cn, similar 
results can be obtained. 



Fig. 4: Numerical studies of the cumulative distribution for different 
p. The simulation is carried out with = 0.17, N{t) = 50e^-^^^^ 
and tc = 30. 'x' represents = a = and the others represent 
fjry = a = 0.02. When ar] = cr = the distribution is a power law 
independent of p. When = cr > 0, the first-part scaling exponent 
does not change while the second-part exponent decreases with p. 
The study is averaged over 50 realizations. 



a parameter representing the relative contribution of the 
noise and the fitness, which can be considered as a measure 
of the degree of the disorder in real-world systems. Note 
that if p = 1, Eq.(lO) becomes Eq.(2) while if p = 0, it 
leads to the geometric Brownian motion. Fluctuation de- 
scribed by the term akidw is based on a general fact that 
in real systems such as WWW, site with large number of 
connections are likely to lose or gain more links than the 
site with small one. The solution of Eq.(lO) is solved to be 
gpr7itg(i-p)[(/x^-o.5cr )t+crw] ^j^ich follows loguormal distri- 
bution with logarithmic mean p/j^rji + (1 — p){l^r] — 0.5<j^)t 
and logarithmic variance (pcr^t)^ + (l — p)^cr^t. By the sim- 
ilar methods used above, one can verify the distribution 
is still a double power law but the second scaling expo- 
nent is controlled by parameter p. In Fig. H] we show the 
cumulative distribution for different p. It is found that 
the first part power-law behavior is still independent of 
p, leading to 71 = — , but the second power-law expo- 
nent decreases with p. Therefore noise fluctuations do not 
change the distribution qualitatively but contribute to the 
second scaling exponent. 

The present model (Eq.(lO)) indicates that evolution of 
a complex system may be characterized by two parts: a 
leading ingredient influencing the evolution and noise fluc- 
tuations. This will be interpreted as follow. Since there 
are usually various ingredients related to the evolution of 
complex system, practically we cannot take all of them 
into account. A feasible method is to consider the most 
important ingredient as the fitness while all other minor 
ones as contributions to fluctuations. Then parameter p 
represents how much the leading ingredient contributes to 



the evolution. If the evolution is totally governed by the 
leading ingredient, then p ^ 1, indicating a deterministic 
pattern. On the other hand, if there is no apparent leading 
ingredient, it leads to p ^ 0, indicating a random picture. 
Therefore our model is general to describe various real- 
world systems which evolve between order and disorder, 
and provide a better understanding on their evolution. As 
we will see in the following section, CAN is a typical ex- 
ample that follows such an evolution mechanism. 

An example: Evolution of CAN. — In this section, 
we will analyze the evolution of CAN to provide evidence 
for our proposed model. 

Chinese airline system can be modeled as a complex 
network with cities representing nodes and flights repre- 
senting edges. The degree of a node is defined as the sum 
of the airlines connecting to it. The degree distribution 
of CAN has been investigated by several studies which 
all indicate a double power-law behavior [T6l[T7l[22] . Here 
we will report some useful information. First we have ana- 
lyzed the total number of nodes N{t) existing at time t. It 
grows as N{t) oc e^^^ with Cn ^ 0.088, which is consistent 
with our assumption that nodes increases exponentially. 
The number of edges M{t) also increases exponentially 
with time as M{t) ex e^"^^ with Cm ^ 0.154. We have 
also measured the parameters of the degree distribution 
of CAN from 1999 to 2003 and a single year 2008, sum- 
marized in Table. [H The exponent 71 stabilizes at about 
0.51 while 72 fluctuates from 2.1 to 2.7. Note that the 
stabilization of 71 is indicated by our model where 71 is 
independent of fluctuations. The turning point kc shows 
an increase from 18 to 30, which is also consistent with 



p-4 



Emergence of double scaling law in complex system 



Table 1: Two scaling exponents (71 and 72) and the turning 
point (kc) of cumulative degree distribution in CAN 





1999 


2000 


2001 


2002 


2003 


2008 


71 


0.46 


0.51 


0.52 


0.51 


0.51 


0.51 


72 


2.2 


2.1 


2.5 


2.3 


2.7 


2.7 


kc 


18 


18 


20 


21 


22 


30 



our result that the turning point increases with evolution 
time. 

In the present paper the evolution of CAN is studied 
by investigating the correlation between GDP and degree. 
The economic growth such as the size of tertiary industry 
has been recently demonstrated to be a leading ingredient 
in shaping the topology of CAN According to the 

discussion in the last section, we consider GDP relates to 
the fitness of the corresponding nodes. Note that the evo- 
lution of degree can also be studied by directly measuring 
the logarithmic ratio of the degree of successive two years. 
But this method can neither help to distinguish the iden- 
tity of the fitness nor provide useful information of the 
corresponding parameters which is important in our anal- 
ysis. As shown in Fig. [5l we found that the degree forms 
a linear function with its corresponding GDP {R^ > 0.62) 
while the fluctuations are obvious. Despite the continu- 
ing evolution of both GDP and degree, this correlation 
has maintained for at least six years (1998 — 2003) since it 
has been first observed in 1998. Therefore it provides some 
key information about the evolution of degree in CAN and 
cannot be viewed as just a coincidence. 

Considering the time evolution, the correlation can be 
described as: 

h{t)=D{t)G^{t), (11) 

where Gi{t) is the GDP of city i at year t. It grows expo- 
nentially as Gi{t) (X e^^^ where follows normal distribu- 
tion with the mean of 0.18 and the standard variance of 
0.02. The strong positive correlation confirms that econ- 
omy may govern the evolution of the degree in CAN while 
the fluctuations, as we mentioned previously, are consid- 
ered to result from some minor ingredients (such as pop- 
ulation density, public administration, geographical con- 
straints, etc). Both the two aspects contribute to D{t), 
leading to an expression given by D{t) = e«(^)e^(^), where 
term e^^^^ is the time-dependent slope and the term e^^^^ 
represents the fluctuations. 

The specific form of a{t) is easy to be evaluated. Sum- 
ming Eq.(ll) for all nodes we get {D{t)) = e^^^^ = 

2M(£}_^ - ^Y^ere EiGiit) is measured to be 

proportional to e^-^^K Thus a{t) = -0.036t. Then 
s{t) can be investigated from data by calculating e{t) = 
ln(^:^) — a{t). However what we concern here is the 
increment of e{t)^ defined as de{t) = e{t) — e{t — 1). In 
Fig. [6] we plot the distribution of deit) for all the five 
years. It follows a normal distribution with the mean 
of and the standard variance of 0.09. Furthermore we 




Fig. 5: The correlation of degree and its corresponding GDP for year 
1999 and year 2002. Both of them exhibit hnear correlation. {Gi(t))]^ 
is an average over all the nodes with the same degree k. Note that 
the data of the two years are not drawn for better visualization. 
(Gi(1999) here increases 2000 from the origin data.) The correlation 
does not change from 1998 to 2003. 



calculate the self-correlation function of de{t)^ defined as 
{de{t)de{t + r)) = dei{t)dei{t + r) (r is the time 

interval). As listed in Table. [2l it exhibits very weak 
correlation (correlation coefficient < 0.1) when r 7^ 0. 
Therefore de{t) can be regarded as white noise and ex- 
pressed as de{t) = 0.09dw. Then D{t) is written as 



^ e 



0.09w-0.036t 



. Substituting it into Eq.(ll) and ap- 



plying differentiation we have [27] 

dki = {Xi - 0.036)kidt + 0.09^;^^^. 



(12) 



Eq.(12) is exact the form of our model which can give rise 
to double power-law distribution. To further demonstrate 
its agreement with the real evolution of CAN, we simulate 
the degree distribution according to Eq.(12). We obtained 
7i = 0.61 (it can also be calculated from 71 = -^ = 
(A )-o 036 ~ 0-61), comparable to the first exponent 0.51 
while 72 = 2.84, good agreement with the second exponent 
2.7. 

Conclusion. — We have proposed a general model to 
explain the emergence of the double power-law distribu- 
tion. The model incorporates fitness consideration and 
noise fluctuation which indicates that evolution of a com- 
plex system may be characterized by two parts: a leading 
ingredient and noise fluctuations. We find that normal 
distributed fitness coupled with exponentially increasing 
variables is responsible for the emergence of the double 
power-law distribution. Fluctuations do not change the 
result qualitatively but contribute to the value of scal- 
ing exponent. We have also studied empirically the CAN 
which turns out to follow the same evolution pattern as 
our proposed model. 

We have only discussed the behavior of our model when 
cr^ < If cr^ is not much larger than the distri- 
bution still decays like a double power-law but both the 
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30 




ds 

Fig. 6: The distribution of the increment of e{t) for all the five 
years. The data is binned into classes. The fitted line is p{d£) = 

r_idef_. 

19. 4e 2*0.092 . Therefore it follows normal distribution with mean 
and standard variance 0.09. Inset: The cumulative distribution 
of d£(t). It is well fitted by P(d£(t) > de) = 116erfc(-^^) where 

erfc{x) = e-^ dx. 

exponents are different from previous ones. With the con- 
tinuing increasing of a^, the double power-law behavior 
turns out to be unconspicuous. It results from that the 
second scaling exponent is gradually close to the first one 
and finally becomes indistinguishable. 

Finally, we would like to mention that we have done 
tests for six usual distributions, namely exponential distri- 
bution, uniform distribution, power-law distribution, Pois- 
son distribution, Rayleigh distribution and Weibull distri- 
bution for the fitness instead of normal distribution, the 
results show that none of them is able to achieve double 
power-law distribution. Therefore we believe that normal 
distribution of fitness is a key ingredient responsible for 
the double power-law distribution. 
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